Markup of Korean Dictionary Entries
Abstract
Dictionary markup (encoding) is one of the concerns of the Text Encoding Initiative (TEI), an international project for text encoding. In this paper, we investigate ways to use and extend the TEI encoding scheme for the markup of Korean dictionary entries. Because the TEI recommendations for dictionary markup are aimed mainly at Western-language dictionaries, we need to cope with the problems that arise in encoding Korean dictionary entries. We extend and modify the TEI encoding scheme in the way TEI itself suggests. We also restrict the content model so that the encoded dictionary can be treated more as a database than as a mere computerized version of a printed dictionary.
Similar resources
Semi-automatic Refinement of the JMdict/EDICT Japanese-English Dictionary
The JMdict/EDICT Japanese-English Dictionary is a freely available dictionary distributed in XML (JMdict) and text (EDICT) formats. It is widely used as a source of lexical material in dictionary systems and text-processing projects. We propose two refinements to make the dictionary more computationally tractable: marking entries where the English is not a translation equivalent and expanding co...
Enhancing a Dictionary for Transfer Rule Acquisition
The JMdict/EDICT Japanese-English Dictionary is a freely available dictionary distributed in XML (JMdict) and text (EDICT) formats. It is widely used as a source of lexical material in dictionary systems and text-processing projects. We propose two refinements to make the dictionary more computationally tractable: marking entries where the English is not a translation equivalent and expanding co...
Detecting Structural Irregularity in Electronic Dictionaries Using Language Modeling
Dictionaries are often developed using tools that save to Extensible Markup Language (XML)-based standards. These standards often allow high-level repeating elements to represent lexical entries, and utilize descendants of these repeating elements to represent the structure within each lexical entry, in the form of an XML tree. In many cases, dictionaries are published that have errors and inco...
A Methodology for the Analysis of Verb Usage Examples in a Context of Lexical Knowledge Acquisition from Dictionary Entries
This paper presents the development of a methodology whose ultimate goal is to help the linguist or lexicographer define the subcategorisation pattern of a verb. As a starting point, we have applied this methodology to the examples in the verb entries of an ordinary dictionary in its machine-readable version. The methodology consists of the following steps: morphological analysis, resolut...
JMdict: A Japanese-Multilingual Dictionary
The JMdict project has as its aim the compilation of a multilingual lexical database with Japanese as the pivot language. Using an XML structure designed to cater for a mix of languages and a rich set of lexicographic information, it has reached a size of approximately 100,000 entries, with most entries having translations in English, French and German. The compilation involves information re-u...